Fast Matching of Twig Patterns
Abstract
Twig pattern matching plays a crucial role in XML data processing. Existing twig pattern matching algorithms fall into two classes: two-phase and one-phase. Two-phase algorithms (e.g., TwigStack) suffer from an expensive merging cost, while one-phase algorithms (e.g., TwigList, Twig2Stack, HolisticTwigStack) either lack efficient filtering of useless elements or rely on overcomplicated data structures. In this paper, we present two novel one-phase holistic twig matching algorithms, TwigMix and TwigFast, which combine the efficient selection of useful elements (introduced in TwigStack) with the simple lists for storing final solutions (introduced in TwigList). TwigMix introduces the element selection function of TwigStack into TwigList to avoid manipulating useless elements in the stack and lists. TwigFast improves on this by adding pointers to the lists, which eliminates the stacks entirely. Our experiments show that TwigMix significantly and consistently outperforms TwigList and HolisticTwigStack (up to several times faster), and that TwigFast is up to two times faster than TwigMix.
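To make the problem concrete, the following is a minimal sketch of what a twig pattern match is. It is a naive nested-loop matcher written only for illustration; it is not TwigStack, TwigList, TwigMix, or TwigFast, and it does none of the filtering or list management those algorithms perform. The (start, end, level) region encoding and the names Elem, QNode, DOC, Q1, Q2, and twig_match are assumptions introduced here, not taken from the paper.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass(frozen=True)
class Elem:
    """An XML element with an assumed (start, end, level) region encoding."""
    tag: str
    start: int
    end: int
    level: int


@dataclass
class QNode:
    """A query node; axis is the edge from its parent: "/" or "//"."""
    tag: str
    axis: str = "//"
    children: list = field(default_factory=list)


def is_ancestor(a, d):
    # d lies strictly inside a's region
    return a.start < d.start and d.end < a.end


def is_parent(a, d):
    return is_ancestor(a, d) and d.level == a.level + 1


def matches(q, e, elems):
    """All match tuples for the sub-twig rooted at q, with q bound to e."""
    per_child = []
    for c in q.children:
        rel = is_parent if c.axis == "/" else is_ancestor
        sub = [m for d in elems
               if d.tag == c.tag and rel(e, d)
               for m in matches(c, d, elems)]
        if not sub:          # one branch has no match, so e is useless
            return []
        per_child.append(sub)
    # combine one match per branch (cross product across branches)
    return [(e,) + sum(combo, ()) for combo in product(*per_child)]


def twig_match(root, elems):
    return [m for e in elems if e.tag == root.tag
            for m in matches(root, e, elems)]


# Hypothetical document: <a><b/><c><b/></c></a>
DOC = [Elem("a", 1, 8, 1), Elem("b", 2, 3, 2),
       Elem("c", 4, 7, 2), Elem("b", 5, 6, 3)]

# Q1: a with a child b and a descendant c; Q2: a with a descendant b
Q1 = QNode("a", children=[QNode("b", "/"), QNode("c", "//")])
Q2 = QNode("a", children=[QNode("b", "//")])

if __name__ == "__main__":
    for name, q in (("Q1", Q1), ("Q2", Q2)):
        print(name, [[e.start for e in m] for m in twig_match(q, DOC)])
```

The sketch rescans the whole element list for every query edge, which is the kind of repeated work that the stack-based and list-based algorithms named in the abstract are designed to avoid.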
Similar Resources
TwigList: Make Twig Pattern Matching Fast
The twig pattern matching problem has been widely studied in recent years. Given an XML tree T, a twig-pattern matching query Q, represented as a query tree, asks for all occurrences of the twig pattern in T. Previous works like HolisticTwig and TJFast decomposed the twig pattern into single paths from root to leaves, and merged all the occurrences of such path-patterns to find the occurre...
A Hybrid Approach for General XML Query Processing
The state-of-the-art XML twig pattern query processing algorithms focus on matching a single twig pattern to a document. However, many practical queries are modeled by multiple twig patterns with joins to link them. The output of twig pattern matching is tuples of labels, while the joins between twig patterns are based on values. The inefficiency of integrating label-based structural joins in t...
MARS: A Matching and Ranking System for XML Content and Structure Retrieval
Structural queries specify complex predicates on the content and the structure of the elements of tree-structured XML documents. Recent works have typically applied top-down decomposition of the twig patterns into (i) parent-child or ancestor-descendant relationships, or (ii) path expression queries, followed by a join operation to reconstruct matched twig patterns. This demonstration s...
QuickStack: A Fast Algorithm for XML Query Matching
With the increasing popularity of XML for data representation and exchange, much research has been done on efficient ways to evaluate twig patterns in an XML database. As a result, many holistic join algorithms have been developed, most of which are derivatives of the well-known TwigStack algorithm. However, these algorithms still apply a two-phase processing scheme: first identify...
Prefix Path Streaming: a New Clustering Method for XML Twig Pattern Matching
Searching for all occurrences of a twig pattern in an XML document is an important operation in XML query processing. Recently a class of holistic twig pattern matching algorithms has been proposed. Compared with the prior approaches, the holistic method avoids generating large intermediate results which do not contribute to the final answer. The method is CPU and I/O optimal when twig patterns ...